
Exploratory Data Analysis¶

Importing processed data¶

date adj_close close high low open volume ticker revenues cost_of_goods ... current_ratio debt_to_equity_ratio ebitda_margin gross_margin net_income_margin dividend_yield payout_ratio return_on_assets return_on_equity return_on_capital
0 2010-04-01 7.108997 8.427500 8.526071 8.312500 8.478929 603145200 AAPL 13499000.0 7874000.0 ... 2.644206 0.450061 0.314468 15.785443 0.227721 0.0 0.0 0.054664 0.078123 0.089877
1 2010-04-05 7.184915 8.517500 8.518214 8.384643 8.392143 684507600 AAPL 13499000.0 7874000.0 ... 2.644206 0.450061 0.314468 15.785443 0.227721 0.0 0.0 0.054664 0.078123 0.089877
2 2010-04-06 7.216546 8.555000 8.580000 8.464286 8.507143 447017200 AAPL 13499000.0 7874000.0 ... 2.644206 0.450061 0.314468 15.785443 0.227721 0.0 0.0 0.054664 0.078123 0.089877
3 2010-04-07 7.248483 8.592857 8.640000 8.523571 8.555357 628502000 AAPL 13499000.0 7874000.0 ... 2.644206 0.450061 0.314468 15.785443 0.227721 0.0 0.0 0.054664 0.078123 0.089877
4 2010-04-08 7.228898 8.569643 8.626429 8.501429 8.587143 572989200 AAPL 13499000.0 7874000.0 ... 2.644206 0.450061 0.314468 15.785443 0.227721 0.0 0.0 0.054664 0.078123 0.089877

5 rows × 106 columns

Plot close price¶

This code is used to visualize the historical closing prices of Apple Inc. (AAPL) stock over time. It first converts the date column in the DataFrame to datetime objects, ensuring proper handling of time-based data. Then, a line plot is generated with the date on the x-axis and the closing prices (close) on the y-axis. The plot is labeled with "date" for the x-axis and "Close Price" for the y-axis, and the title "AAPL Close Price" is added to the chart. Finally, the plot is displayed to help identify trends, fluctuations, and patterns in the stock's performance over the selected time period.
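
The steps described above can be sketched roughly as follows. The notebook's own cell is not shown here, so the helper name `plot_close_price`, the figure size, and the five-row sample frame are assumptions for illustration; only the column names and labels come from the text.

```python
import pandas as pd
import matplotlib.pyplot as plt


def plot_close_price(df: pd.DataFrame, ticker: str = "AAPL"):
    """Line plot of closing price against date; returns the Axes.

    Hypothetical helper, not the notebook's original code.
    """
    df = df.copy()
    # Parse the date strings so matplotlib treats the x-axis as time.
    df["date"] = pd.to_datetime(df["date"])

    fig, ax = plt.subplots(figsize=(12, 6))  # figure size is an assumption
    ax.plot(df["date"], df["close"])
    ax.set_xlabel("date")
    ax.set_ylabel("Close Price")
    ax.set_title(f"{ticker} Close Price")
    return ax


# Date and close values copied from the first five rows of the table above.
sample = pd.DataFrame({
    "date": ["2010-04-01", "2010-04-05", "2010-04-06", "2010-04-07", "2010-04-08"],
    "close": [8.427500, 8.517500, 8.555000, 8.592857, 8.569643],
})
ax = plot_close_price(sample)
# In a notebook, plt.show() renders the figure.
```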


Normalized Net Income¶

  • Normalize 'Close' and 'Net Income' Columns

    • The close and net_income columns are normalized using Min-Max scaling, which transforms the values to a range between 0 and 1:
      • normalized_close: Scales the close prices.
      • normalized_net_income: Scales the net_income values.
  • Plot Normalized Data

    • A plot is created with the date on the x-axis and the normalized values (normalized_close and normalized_net_income) on the y-axis.
    • The line for normalized_close is labeled as "Normalized Close Price," and the line for normalized_net_income is labeled as "Normalized Net Income."
  • Add Labels and Title

    • The x-axis is labeled as "Date," and the y-axis is labeled as "Normalized Value."
    • The title of the plot is set to "Normalized Close Price vs. Normalized Net Income."
  • Add Grid and Legend

    • A grid is added for better readability, and a legend is included to differentiate the two lines in the plot.
  • Display the Plot

    • plt.show() is called to display the final plot.
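
Min-Max scaling maps each value x to (x - min) / (max - min). A minimal sketch of the normalization step, assuming a plain pandas implementation (the notebook may have used a library scaler instead); the helper name `min_max_scale` and the toy values are made up for illustration:

```python
import pandas as pd


def min_max_scale(series: pd.Series) -> pd.Series:
    """Rescale a numeric series linearly onto [0, 1] (hypothetical helper)."""
    lo, hi = series.min(), series.max()
    return (series - lo) / (hi - lo)


# Toy values for illustration only; they are not taken from the dataset.
df = pd.DataFrame({
    "close": [8.0, 10.0, 12.0, 16.0],
    "net_income": [100.0, 300.0, 200.0, 500.0],
})
df["normalized_close"] = min_max_scale(df["close"])
df["normalized_net_income"] = min_max_scale(df["net_income"])
```

Because both columns end up on the same 0 to 1 range, they can share one y-axis even though their raw magnitudes differ by orders of magnitude.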

Calculate Moving Average¶

  • Define Moving Average Period

    • mavgd is set to 30, specifying the number of days for the moving average calculation.
  • Calculate Moving Average

    • The function calculate_moving_average() computes the moving average for the 'close' prices using a rolling window of size window. It creates a new column in the DataFrame, labeled {window}_DMA, representing the moving average for the specified window.
  • Apply Moving Average Calculation

    • The function is applied to the DataFrame df with the moving average period (mavgd = 30), creating a new column called 30_DMA containing the 30-day moving average of the 'close' price.
  • Plot Close Price and Moving Average

    • A plot is created with the date on the x-axis and both the close price and the calculated moving average (30_DMA) on the y-axis.
    • The plot displays two lines: one for the "Close Price" and another for the "30-Day Moving Average."
  • Add Labels and Title

    • The x-axis is labeled as "Date," and the y-axis is labeled as "Price."
    • The title of the plot is dynamically set to "AAPL Close Price and 30-Day Moving Average."
  • Add Grid and Legend

    • A grid is added to the plot for better readability, and a legend is included to differentiate between the "Close Price" and "30-Day Moving Average."
  • Display the Plot

    • plt.show() is called to display the final plot.
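
A possible shape for `calculate_moving_average()`, based only on the description above. The `min_periods=1` argument is an assumption: the output table later in this notebook shows `30_DMA` equal to `close` on the first row, which a strict 30-row window would leave as NaN. The toy frame uses a window of 3 instead of 30 to stay readable.

```python
import pandas as pd


def calculate_moving_average(df: pd.DataFrame, window: int) -> pd.DataFrame:
    """Add a '{window}_DMA' column holding the rolling mean of 'close'."""
    # min_periods=1 (assumed) averages over however many rows are available
    # until a full window exists, so early rows are not NaN.
    df[f"{window}_DMA"] = df["close"].rolling(window=window, min_periods=1).mean()
    return df


mavgd = 3  # the notebook uses 30; 3 keeps this toy example small
df = pd.DataFrame({"close": [8.4275, 8.5175, 8.5550, 8.592857, 8.569643]})
df = calculate_moving_average(df, mavgd)
```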

Label Generation¶

  • Define Moving Average Period for Label Generation

    • labels_moving_average_days is set to 10, specifying the number of days for the rolling average calculation used to generate the buy/sell signals.
  • Calculate Moving Average

    • The 10-day rolling average of the close price is calculated using the rolling(window=labels_moving_average_days) method, and a new column, 10_Day_Avg, is added to the DataFrame to store this value.
  • Shift the Moving Average

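    • The 10_Day_Avg column is shifted up by 10 rows (using .shift(-labels_moving_average_days)), so each row holds the moving average as of 10 trading days later, that is, the mean of the following 10 closes. This lets each day's close price be compared with the next 10-day moving average. The shifted average is stored in a new column, Next_10_Day_Avg.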
  • Generate Buy/Sell Signals

    • A new column, signal, is created and initialized with a default value of "SELL."
    • A condition is applied to identify where the close price is lower than the Next_10_Day_Avg. For these rows, the signal column is updated to "BUY."
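
The labeling steps above can be sketched as follows. The column names follow the text; the toy prices and the window of 3 (instead of 10) are assumptions chosen so the result can be checked by hand.

```python
import pandas as pd

labels_moving_average_days = 3  # the notebook uses 10
n = labels_moving_average_days

# Toy prices: a rise followed by a fall.
df = pd.DataFrame({"close": [10.0, 11.0, 12.0, 13.0, 14.0, 13.0, 12.0, 11.0, 10.0]})

# Trailing n-day mean; the first n-1 rows are NaN.
df[f"{n}_Day_Avg"] = df["close"].rolling(window=n).mean()

# A negative shift pulls future values up: row i receives the average
# computed at row i+n, i.e. the mean of closes i+1 .. i+n.
df[f"Next_{n}_Day_Avg"] = df[f"{n}_Day_Avg"].shift(-n)

# Default to SELL, then mark BUY where today's close is below the future average.
df["signal"] = "SELL"
df.loc[df["close"] < df[f"Next_{n}_Day_Avg"], "signal"] = "BUY"
```

The last n rows have no future average (NaN), the comparison is False there, and they keep the default "SELL" label. Since the label is built from future prices, it is available only for historical data, which is what makes it a training target rather than a live indicator.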

Rationale for Choosing This as the Target Variable¶

  • Stock prices themselves are difficult to predict because many factors drive them and they vary widely.
  • We can estimate whether the trend is up or down over a short or long period using trend measures such as moving average comparisons, but these are lagging indicators.
  • Such indicators smooth the prices, but because they lag, most of the move has already happened by the time they react.
  • If we can predict, x days in advance, where the moving average will cross the current close price, we can participate in the trend early and capture some of the gain.
  • The model will learn to predict these intersections beforehand.
  • During training, we're essentially teaching the model to recognize patterns that lead to these intersections.

This code is used to create a simple trading strategy based on comparing a stock's close price with its future moving average to generate buy and sell signals.

Plotting the target variable with closing price and moving average¶


Scatter Plot for BUY/SELL signals¶

  • Purpose of Buy/Sell Signals
    The code generates buy and sell signals by comparing the current close price with the future moving average (Next_10_Day_Avg). These signals can be used to develop a simple trading strategy:

    • BUY: When the current price is below the future moving average, it suggests a potential upward movement, and a "BUY" signal is generated.
    • SELL: When the current price is above the future moving average, it indicates that the price might decline, and a "SELL" signal is generated.
  • Plotting Buy and Sell Signals
    The plot visualizes these buy and sell signals on the stock's price chart:

    • Green Circles (BUY Signals): Represent points where the stock price is below the expected moving average, suggesting a buying opportunity.
    • Red Circles (SELL Signals): Represent points where the stock price is above the expected moving average, indicating a potential selling point.

What the Plot Shows¶

  • The plot displays the stock’s closing price over time, with green dots indicating where the algorithm suggests buying the stock, and red dots marking suggested sell points.
  • The x-axis represents the date, showing the timeline over which these decisions were made, while the y-axis represents the close price of the stock.
  • The combination of the price trend and these signals can help visualize potential entry and exit points for a trading strategy based on historical price movements and moving averages.
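
One way such a chart could be drawn is sketched below. The helper name `plot_signals`, the gray price line, the figure size, and the toy rows are assumptions; only the green/red circle convention and the axis meanings come from the text.

```python
import pandas as pd
import matplotlib.pyplot as plt


def plot_signals(df: pd.DataFrame):
    """Closing price with BUY rows as green circles and SELL rows as red circles."""
    buys = df[df["signal"] == "BUY"]
    sells = df[df["signal"] == "SELL"]

    fig, ax = plt.subplots(figsize=(12, 6))  # figure size is an assumption
    ax.plot(df["date"], df["close"], color="gray", linewidth=1, label="Close Price")
    ax.scatter(buys["date"], buys["close"], color="green", marker="o", label="BUY")
    ax.scatter(sells["date"], sells["close"], color="red", marker="o", label="SELL")
    ax.set_xlabel("Date")
    ax.set_ylabel("Close Price")
    ax.legend()
    return ax


# Toy rows for illustration only.
toy = pd.DataFrame({
    "date": pd.to_datetime(["2010-04-01", "2010-04-05", "2010-04-06",
                            "2010-04-07", "2010-04-08"]),
    "close": [10.0, 11.0, 12.0, 11.5, 11.0],
    "signal": ["BUY", "BUY", "BUY", "SELL", "SELL"],
})
ax = plot_signals(toy)
# In a notebook, plt.show() renders the figure.
```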

Find Cumulative Profit¶

The code simulates a simple trading strategy where buy and sell signals are used to track the cumulative profit of a stock position. The goal is to calculate the profit from buying and selling stocks based on a series of buy and sell signals.

  • Initialization:
    The following DataFrame columns and loop variables are initialized to track the trading process:

    • stock_quantity: Column recording how many stocks are held at each row.
    • total_buy_price: Column recording the total price spent on the stocks currently held.
    • stock_profit: Column recording the profit realized by each sell transaction.
    • quantity and buy_price: Running loop variables, both set to 0 initially to represent no stock purchased.
  • Iterating Through the Data:
    The code then iterates over each row of the DataFrame (df), simulating trading based on buy ('BUY') and sell ('SELL') signals.

    • When the signal is 'BUY':

      • The quantity of stocks held (quantity) is increased by 1 (simulating the purchase of one stock).
      • The total buy price (buy_price) is updated by adding the stock’s closing price.
      • The DataFrame is updated to reflect the current quantity of stock and total purchase price.
    • When the signal is 'SELL' (and there are stocks to sell):

      • The profit from selling is calculated as:

        Profit = (quantity * current close price) - total buy price

      • After selling, the stock quantity and total buy price are reset to 0, and the profit for that sell transaction is recorded in the DataFrame.

    • When there is no buy or sell signal ('Hold' condition):

      • If no action is taken (i.e., when the signal is not 'BUY' or 'SELL'), the stock holdings and total buy price remain unchanged.
  • Cumulative Profit Calculation:
    After processing the signals, the cumulative profit is calculated using cumsum(), which returns the running total of profits up to each point in time. The cumulative profit reflects the overall performance of the simulated trading strategy up to each date in the DataFrame.

  • This simulation calculates the cumulative profit or loss for a strategy based on buy and sell signals.

  • Cumulative Profit: The final column, cumulative_profit, tracks the total profit from all completed buy/sell transactions as the algorithm progresses through the dataset. This gives a clear picture of how much profit would have been accumulated over time based on the trading signals provided.
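
The simulation described above can be sketched as a single pass over the rows. The function name `simulate_trades` and the toy data are assumptions; the column and variable names follow the text, and transaction costs are ignored, as in the description.

```python
import pandas as pd


def simulate_trades(df: pd.DataFrame) -> pd.DataFrame:
    """Buy one share on every BUY row; liquidate everything on the next SELL."""
    df = df.copy()
    df["stock_quantity"] = 0.0
    df["total_buy_price"] = 0.0
    df["stock_profit"] = 0.0

    quantity = 0     # shares currently held
    buy_price = 0.0  # total paid for the shares currently held

    for idx, row in df.iterrows():
        if row["signal"] == "BUY":
            quantity += 1
            buy_price += row["close"]
        elif row["signal"] == "SELL" and quantity > 0:
            # Profit = (quantity * current close price) - total buy price
            df.at[idx, "stock_profit"] = quantity * row["close"] - buy_price
            quantity = 0
            buy_price = 0.0
        # A SELL with nothing held changes nothing.
        df.at[idx, "stock_quantity"] = quantity
        df.at[idx, "total_buy_price"] = buy_price

    # Running total of realized profit.
    df["cumulative_profit"] = df["stock_profit"].cumsum()
    return df


# Toy data for illustration only.
toy = pd.DataFrame({
    "close": [10.0, 11.0, 13.0, 12.0, 9.0, 12.0],
    "signal": ["BUY", "BUY", "SELL", "SELL", "BUY", "SELL"],
})
result = simulate_trades(toy)
```

In the toy run, two shares bought at 10 and 11 are sold at 13 for a profit of 5, the second SELL is a no-op, and one share bought at 9 is sold at 12 for a profit of 3.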

date adj_close close high low open volume ticker revenues cost_of_goods ... normalized_net_income 30_DMA 10_DMA 10_Day_Avg Next_10_Day_Avg signal stock_quantity total_buy_price stock_profit cumulative_profit
0 2010-04-01 7.108997 8.427500 8.526071 8.312500 8.478929 603145200 AAPL 13499000.0 7874000.0 ... 0.000000 8.427500 8.427500 NaN 8.668214 BUY 1.0 8.427500 0.0 0.000000
1 2010-04-05 7.184915 8.517500 8.518214 8.384643 8.392143 684507600 AAPL 13499000.0 7874000.0 ... 0.000000 8.472500 8.472500 NaN 8.698857 BUY 2.0 16.945000 0.0 0.000000
2 2010-04-06 7.216546 8.555000 8.580000 8.464286 8.507143 447017200 AAPL 13499000.0 7874000.0 ... 0.000000 8.500000 8.500000 NaN 8.716893 BUY 3.0 25.500000 0.0 0.000000
3 2010-04-07 7.248483 8.592857 8.640000 8.523571 8.555357 628502000 AAPL 13499000.0 7874000.0 ... 0.000000 8.523214 8.523214 NaN 8.783393 BUY 4.0 34.092857 0.0 0.000000
4 2010-04-08 7.228898 8.569643 8.626429 8.501429 8.587143 572989200 AAPL 13499000.0 7874000.0 ... 0.000000 8.532500 8.532500 NaN 8.878107 BUY 5.0 42.662500 0.0 0.000000
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2702 2020-12-23 128.059906 130.960007 132.429993 130.779999 132.160004 88223700 AAPL 58313000.0 35943000.0 ... 0.426626 122.093333 126.955000 126.955000 NaN SELL 0.0 0.000000 0.0 4631.666698
2703 2020-12-24 129.047501 131.970001 133.460007 131.100006 131.320007 54930100 AAPL 58313000.0 35943000.0 ... 0.426626 122.509333 127.828001 127.828001 NaN SELL 0.0 0.000000 0.0 4631.666698
2704 2020-12-28 133.662994 136.690002 137.339996 133.509995 133.990005 124486200 AAPL 58313000.0 35943000.0 ... 0.426626 123.092000 129.256001 129.256001 NaN SELL 0.0 0.000000 0.0 4631.666698
2705 2020-12-29 131.883286 134.869995 138.789993 134.339996 138.050003 121047300 AAPL 58313000.0 35943000.0 ... 0.426626 123.612333 130.565000 130.565000 NaN SELL 0.0 0.000000 0.0 4631.666698
2706 2020-12-30 130.758759 133.720001 135.990005 133.399994 135.580002 96452100 AAPL 58313000.0 35943000.0 ... 0.426626 124.059666 131.149001 131.149001 NaN SELL 0.0 0.000000 0.0 4631.666698

2707 rows × 117 columns

2
Number of profitable trades: 183
Number of losing trades: 2
Number of neutral trades: 2522
Total Profit: 4631.67
Average Profit per Trade: 25.04
Max Profit: 448.06
Max Loss: -0.09
Profit Standard Deviation: 17.34

Correlation Analysis¶

10_DMA                        0.998379
cumulative_profit             0.957153
debt_to_equity_ratio          0.909119
price_to_book_value           0.891101
other_assets                  0.880511
market_capitalization         0.868078
research_&_development        0.861709
enterprise_valuation          0.860277
common_stock                  0.853777
property_plant_&_equipment    0.848398
Name: close, dtype: float64
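
The output above looks like a ranking of the columns most correlated with close. The cell that produced it is not shown, so the following is only a guess at the general approach, assuming Pearson correlation over the numeric columns; the helper name `top_correlations` and the toy columns are hypothetical, and the notebook evidently excluded some columns that this sketch does not.

```python
import pandas as pd


def top_correlations(df: pd.DataFrame, target: str = "close", n: int = 10) -> pd.Series:
    """Return the n columns with the highest Pearson correlation to `target`."""
    numeric = df.select_dtypes("number")  # skip text columns such as ticker
    corr = numeric.corr()[target]
    return corr.drop(target).sort_values(ascending=False).head(n)


# Toy data for illustration only.
toy = pd.DataFrame({
    "close": [1.0, 2.0, 3.0, 4.0, 5.0],
    "up": [2.0, 4.0, 6.0, 8.0, 10.0],
    "down": [5.0, 4.0, 3.0, 2.0, 1.0],
    "noisy": [1.0, 3.0, 2.0, 5.0, 4.0],
    "ticker": ["AAPL"] * 5,
})
ranking = top_correlations(toy)
```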

Principal Component Analysis¶

Number of components needed for 99% variance explained: 4
((2707, 102), (2707, 103))
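
The component count reported above is the smallest number of principal components whose cumulative explained variance reaches 99%. The notebook's PCA cell is not shown, so the sketch below is an assumption: it uses NumPy's SVD on mean-centered data rather than whatever library the notebook used, and it does not rescale features, which may differ from the original preprocessing. The helper name `components_for_variance` and the toy matrix are hypothetical.

```python
import numpy as np


def components_for_variance(X: np.ndarray, threshold: float = 0.99) -> int:
    """Smallest k such that the first k principal components explain
    at least `threshold` of the total variance."""
    Xc = X - X.mean(axis=0)                    # center each feature
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values, descending
    explained = s**2 / np.sum(s**2)            # explained variance ratio
    cumulative = np.cumsum(explained)
    # First index where the cumulative ratio reaches the threshold, as a count.
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(s))


# Toy matrix: one dominant direction, one minor direction, one constant column.
X = np.array([
    [10.0, 0.0, 0.0],
    [-10.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, -1.0, 0.0],
])
k99 = components_for_variance(X)          # first component explains 200/202
k995 = components_for_variance(X, 0.995)  # needs the second component too
```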